This brief review is concerned with some ostensibly conflicting empirical
findings which have been reported in conjunction with Elkind's (1967)
conjecture that Piaget's conservation problems tap two distinct concepts.
In preparing this review, our principal aim was to determine whether or not
it is possible to effect a rapprochement between the conflicting groups of
findings via the consideration of certain measurement issues associated with
the assessment of cognitive-developmental sequences. The paper begins with
an overview of the problem. Next, three procedural variables are discussed
which are known to affect the visibility of cognitive-developmental sequences.
Finally, some data concerned with these procedural variables are presented.

Background 

Theory 

According to Elkind, the standard conservation paradigm, which has
been employed in so many developmental investigations, should be viewed as
assessing two concepts rather than one. Elkind called these two concepts
"identity" and "equivalence." Identity was defined as the child's
understanding that simple quantitative properties (e.g., length, weight) of
single stimulus objects remain constant across irrelevant perceptual
transformations. Equivalence was defined as the child's understanding that the
equality of pairs of stimulus objects vis-à-vis these same quantitative
properties is not affected by irrelevant perceptual transformations of one
of the pair members. Thus, the identity concept is concerned with a
quantitative relationship between two successive states of the same object,
whereas the equivalence concept is concerned with the quantitative
relationship between two objects during two successive states.

Elkind went on to argue, primarily on logical grounds, that identity
should precede equivalence in children's thinking. More recently, Elkind
has suggested that identity normally emerges during the late preschool years
and equivalence normally emerges during the early elementary school years
(Elkind & Schoenfeld, 1972).

Neither Elkind's partitioning of the standard conservation paradigm
into identity conservation and equivalence conservation nor his prediction
of a developmental lag between the two concepts is acknowledged in Piagetian
theory. Concerning the former point, it is clear that identity and
equivalence are simply "conservation" from the standpoint of the theory (e.g.,
Piaget, 1968, chapter 2). Concerning the latter point, a key assumption of
Piaget's stage philosophy of mental growth (e.g., Piaget, 1956, 1960, 1971)
is that the emergence of the prototypic concepts of each stage is a
synchronous and unitary process (cf. also Brainerd, 1973a, 1973c; Flavell,
1971; Flavell & Wohlwill, 1969; Pinard & Laurendeau, 1969). In the explicit
case of the conservation concept, it is assumed that the various "components"
of the concept emerge in tight synchrony in each quantitative area (e.g.,
Piaget, 1952).

To avoid subsequent confusion, it is important to issue some
terminological caveats at this point. Although Piagetian theory does not
acknowledge the distinction just discussed, it does acknowledge a distinction
between identity and conservation. However, Elkind's and Piaget's respective
versions of "identity" are very different. Elkind's identity quite
obviously is a quantitative concept, but Piaget's identity is not. Piaget
(e.g., 1968, chapter 2) defines identity as the child's understanding that
the transformed stimulus is the "same" object as before the transformation.
In contrast, Elkind's identity refers to the child's understanding that the
transformed stimulus is the "same amount" as before the transformation. To
illustrate, consider the length conservation problem. We begin with two pieces
of string of the same length which are placed side by side so that their
equivalence is evident to casual inspection. One of the strings is bent into
a circle. If the subject understands that the transformed stimulus is still
"the same piece of string," Piaget's identity concept is present. If the
subject understands that the transformed stimulus is still "the same length,"
then Elkind's identity concept is present. In Piaget's terminology, however,
the latter response would simply be "conservation."

There is an important substantive reason for keeping these two versions
of identity strictly separated. Although Piagetian theory does not
acknowledge a developmental lag between Elkind's identity and equivalence, it
does acknowledge a lag between Piaget's identity and conservation (Piaget,
1968, chapter 2). If one does not distinguish between the two identities,
therefore, it is possible to argue that Piaget's predictions are the same as
Elkind's. In the literature, the two identities are distinguished by
appending "quantitative" to Elkind's version and "qualitative" to Piaget's
version. In the present paper, however, whenever the term "identity" is
employed, we shall understand the referent to be Elkind's version.

Empirical Evidence¹

Shortly after Elkind's original paper appeared, Hooper (1969a, 1969b)
published some data which he viewed as being consistent with the prediction
that identity conservation precedes equivalence conservation. Subsequent
investigations by Elkind and Schoenfeld (1972), Papalia and Hooper (1971),
and Schwartz and Scholnick (1970) provided further support for the prediction
in the content areas of discontinuous quantity, liquid quantity, solid
continuous quantity, length, and number. Concurrently, however, a series
of studies was reported in which the identity -> equivalence sequence was not
observed. Koshinsky and Hall (1973) failed to find the sequence for
discontinuous quantity using a design that was virtually a point-for-point
replication of Hooper's (1969a) original study. Moynahan and Glick (1972)
failed to find the sequence for weight, number, and liquid quantity. Murray
(1970) failed to find the sequence for weight and number. Northman and
Gruen (1970) failed to find the sequence for liquid quantity.

Somewhat surprisingly, investigators reporting nonconfirming evidence
have not used their findings as a basis for challenging Elkind's original
argument. Instead, the tendency has been to accept Elkind's argument as
logically sound and to advance the ad hoc argument that, for reasons
unknown, cognitive development does not conform to the dictates of logic in
the special case of conservation concepts (cf. especially Koshinsky & Hall,
1973, p. 423; Moynahan & Glick, 1972, p. 251).

The obvious discrepancy between the supportive and nonsupportive 
groups of studies, coupled with the significance of Elkind's predicted 
sequence from the standpoint of Piagetian theory, prompted the present 
writers to conduct an analysis of the procedural details of the studies. 
Our general aim was to determine whether or not the two groups of studies 
tended to differ on one or more of three variables which are known to affect 
the visibility of asynchronies in the emergence of Piagetian concepts.

Our preliminary hypothesis was that the identity -> equivalence
sequence actually exists in the referent population and, therefore, some
simple design difference(s) between the supportive and nonsupportive studies
probably would explain the discrepant findings. Two facts seemed to justify
entertaining this hypothesis at the outset. First and most important in our
view, the logic underlying Elkind's prediction is ineluctable. Given that
we accept his operational definitions of identity and equivalence, as the
authors of all the relevant studies do, the prediction follows Q.E.D.:
Identity must emerge before equivalence because the operational definition
of the former is a component of the operational definition of the latter, but
not conversely. Second, the reverse of the predicted sequence (i.e.,
equivalence -> identity) has never been observed to the best of our knowledge.
If the identity -> equivalence sequence does not exist in the referent
population and the concepts actually emerge synchronously, then we would expect
on statistical grounds that the sequence and its reverse would be observed
with roughly equal frequency. If the predicted sequence does exist, however,
it is not at all unreasonable that it would be observed in some studies
and that synchrony would be observed in other studies. In fact, if the
predicted sequence happens to be a relatively precise one, its visibility
will be markedly affected by measurement and sampling errors and, hence,
we would expect just such a pattern of findings. Unfortunately, this
argument also is ad hoc unless we know what the explicit sources of
error are. Three possible sources will now be discussed.

Summary of Analysis 
Our analysis was focused on two potential measurement errors and one
potential sampling error which have proved troublesome in past research on
cognitive-developmental sequences: (a) relative task sensitivity; (b) response
criteria; (c) age of subjects. The results of the analysis are
summarized briefly by variable.

Relative Task Sensitivity

Flavell (1971) has argued convincingly that failure to equate for
relative task sensitivity is a pervasive measurement error in
cognitive-developmental sequence research. According to Flavell, failures to
equate for relative task sensitivity can mask sequences which exist in the
population and also can manufacture spurious sequences. The argument which
supports this claim runs as follows. Suppose we have two concepts A and B such
that A invariably emerges before B during cognitive development. That is, there
is a real A -> B sequence in the population. Suppose we measure A and B
in an appropriate sample of subjects using a very insensitive test of A and
a very sensitive test of B. With these particular tests, the A -> B sequence
may not be observed because there will be a high rate of Type II error ("false
negatives") on the A test. Now, suppose that A and B emerge synchronously
during cognitive development and that the same tests are administered to
the same sample. This time it is likely that a spurious B -> A sequence will
be observed in the resulting data. In view of the high false negative rate
on the A test, a large proportion of those subjects who actually possess
both A and B will pass the B test and fail the A test but not conversely.
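
The second half of Flavell's argument can be made concrete with a small calculation. The sketch below is only illustrative: the function name, the false-negative rates, and the proportion of subjects possessing both concepts are hypothetical values chosen for the example, and the two tests' errors are assumed to be independent with no false positives.

```python
def observed_discordant(p_both, fn_a, fn_b):
    """Expected proportions of 'A only' and 'B only' outcomes in a sample
    when A and B truly emerge together (every subject has both or neither).

    p_both     : proportion of subjects who really possess both concepts
    fn_a, fn_b : false-negative rates of the A and B tests (hypothetical)
    Assumes independent test errors and no false positives.
    """
    pass_a = 1.0 - fn_a
    pass_b = 1.0 - fn_b
    a_only = p_both * pass_a * fn_b  # passes A, wrongly fails B
    b_only = p_both * fn_a * pass_b  # wrongly fails A, passes B
    return a_only, b_only


# Insensitive A test (40% false negatives), sensitive B test (5%).
a_only, b_only = observed_discordant(p_both=0.5, fn_a=0.40, fn_b=0.05)
print(f"A only: {a_only:.3f}   B only: {b_only:.3f}")
```

With these assumed rates, "B only" subjects outnumber "A only" subjects by more than ten to one, which is the spurious B -> A pattern described above even though the concepts emerge synchronously.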

A recent review of the role of task sensitivity in developmental
studies of transitivity and conservation (Brainerd, 1973a) provides support
for Flavell's argument. When the relative sensitivities of transitivity
and conservation tests are equated, transitivity is observed to emerge
before conservation (Brainerd, 1973a, 1974a, 1974b; Brainerd & Vanden Heuvel,
1974; Toniolo & Hooper, 1974). However, when very insensitive tests of
transitivity are used in conjunction with reasonably sensitive tests of
conservation, transitivity either is observed to emerge after conservation
(e.g., McManis, 1969; Smedslund, 1963) or is observed to emerge synchronously
with conservation (e.g., Lovell & Ogilvie, 1961).

Our examination of the identity and equivalence tests employed in the
supportive and nonsupportive groups of studies revealed no consistent
between-group differences in relative task sensitivity. Although some
differences in this variable were noted in virtually all the studies, there was
no evidence that (a) blatantly insensitive identity tests had been used in
conjunction with reasonably sensitive equivalence tests in the nonsupportive
studies or that (b) blatantly insensitive equivalence tests had been used
in conjunction with reasonably sensitive identity tests in the supportive
studies. Either (a) or (b) obviously would tend to explain the discrepancy
between the two groups of studies.

Response Criteria

It has become a commonplace in reviews of neo-Piagetian research to
observe that there is an alarming lack of consensus among investigators
concerning the appropriate response criteria for inferring the presence of
Piagetian concepts (e.g., Beilin, 1971; Brainerd, 1973a, 1973b, 1974c; Hooper,
Goldman, Storck, & Burke, 1971). Two explicit questions about the response
criteria employed in the identity/equivalence studies were examined in
this portion of the analysis. First, is there any evidence either that
(a) the response criteria for identity tend to be more stringent than the
response criteria for equivalence in the nonsupportive studies or that
(b) the converse tends to be true in the supportive studies? It should
be obvious that (a) would tend to mask a real identity -> equivalence sequence
and (b) would tend to manufacture a spurious sequence. Second, is there any
evidence either that (c) a response criterion of judgments-plus-explanations
was used for both identity and equivalence items in the nonsupportive studies
or that (d) a response criterion of judgments-only was used for both identity
and equivalence items in the supportive studies?

Concerning the second question, it has been shown elsewhere on 
theoretical grounds that a judgments-plus-explanations criterion for items 
on Piagetian concept assessments tends to mask real developmental sequences 
in the measured concepts (Brainerd, 1973b, 1974c). From a theoretical 
standpoint, including verbal rationales as an essential component of the 
response criterion introduces at least two specifiable sources of Type II
error. Given two concepts A and B such that A actually precedes B during 
cognitive development, the introduction of any source of Type II error will 
tend to reduce the visibility of this sequence. The psychometric rationale 
for this statement is elementary. 

In any given study, inferences about the order of emergence of A and
B are based on a comparison of the observed frequencies of subjects who
evidence A in the absence of B and subjects who evidence B in the absence of
A. Subjects who evidence both concepts or neither concept are not of
interest. If the observed frequencies of A/not-B and not-A/B subjects in a given
sample differ significantly, then we reject the null hypothesis that the
two concepts are not acquired in a fixed order (cf. also Brainerd, 1974d).
Assuming a real A -> B sequence in the population, the effect of any source
of Type II error must be to decrease the expected frequency of the A/not-B
category relative to the expected frequency of the not-A/B category and
thereby reduce the visibility of the sequence:

If A actually precedes B, then the population consists of only three
types of subjects: not-A/not-B, A/not-B, and A/B. Assume that the population
frequencies of these three categories are P_1, P_2, and P_3, respectively,
where P_1 + P_2 + P_3 = 1. Also, assume that our method of assessing A and B
incorporates a source of Type II error which reduces the probability that
either concept will be judged present, given that it is present, by some factor
0 < x < 1. The probability that any not-A/not-B subject will be correctly
classified is unity. The probability that any A/not-B subject will be
correctly classified is 1 - x and the probability that he will be incorrectly
classified as a not-A/not-B is x. The probability that any A/B subject
will be correctly classified is 1 - 3x^2 and the probability that he will be
incorrectly classified as a not-A/not-B or an A/not-B or a not-A/B is x^2
in each instance. Hence, the expected frequencies of the four possible
subject categories following Type II error are

    E(not-A/not-B) = P_1 + x P_2 + x^2 P_3,        [1]

    E(A/not-B) = P_2 (1 - x) + x^2 P_3,            [2]

    E(not-A/B) = x^2 P_3, and                      [3]

    E(A/B) = P_3 (1 - 3x^2).                       [4]

Note that one effect of Type II error is to decrease the difference
between the expected frequencies of the second and third categories,
relative to the difference between their population values. Given that
P_2, P_3, and x all lie between 0 and 1, the difference between the right
side of Equation 2 and the right side of Equation 3 must be smaller than the
difference between P_2 and 0. Therefore, the commission of Type II errors
reduces the probability that the null hypothesis mentioned above will be
correctly rejected.
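
Equations 1 through 4 can be checked numerically. The following is a minimal sketch, not part of the original analysis: the function name and the particular values of P_1, P_2, P_3, and x are hypothetical, chosen only to show that the four expected frequencies still sum to one and that the gap between the A/not-B and not-A/B categories shrinks from P_2 to P_2(1 - x).

```python
def expected_frequencies(p1, p2, p3, x):
    """Expected category frequencies after Type II error (Equations 1-4).

    p1, p2, p3 : population frequencies of not-A/not-B, A/not-B, and A/B
    x          : Type II error factor, 0 < x < 1
    Note: Equation 4 is a valid probability only while 3 * x**2 <= 1.
    """
    if abs(p1 + p2 + p3 - 1.0) > 1e-9:
        raise ValueError("population frequencies must sum to 1")
    e_neither = p1 + x * p2 + x**2 * p3   # Equation 1
    e_a_only = p2 * (1 - x) + x**2 * p3   # Equation 2
    e_b_only = x**2 * p3                  # Equation 3
    e_both = p3 * (1 - 3 * x**2)          # Equation 4
    return e_neither, e_a_only, e_b_only, e_both


# Hypothetical population and error factor, for illustration only.
freqs = expected_frequencies(p1=0.3, p2=0.3, p3=0.4, x=0.2)
e_neither, e_a_only, e_b_only, e_both = freqs
print("sum of expected frequencies:", round(sum(freqs), 10))
print("population gap (P_2 - 0):   ", 0.3)
print("observed gap (Eq. 2 - Eq. 3):", round(e_a_only - e_b_only, 10))
```

Because the x^2 P_3 terms cancel when Equation 3 is subtracted from Equation 2, the observed gap is always P_2(1 - x), which is smaller than P_2 for any admissible x.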

By the preceding argument, any response criterion which is known to
increase the Type II error rate will reduce the visibility of a real A -> B
sequence by increasing the observed frequency of the not-A/B category
relative to the A/not-B category. In line with this argument, the masking
effect of the judgments-plus-explanations criterion has been demonstrated
empirically in the case of the well-known number conservation -> quantity
conservation sequence (Brainerd & Brainerd, 1972; Gruen & Vore, 1972). In
these studies, the developmental precedence of number conservation over
quantity conservation was much less apparent with judgments-plus-explanations
than with judgments-only. In the Brainerd and Brainerd study, the
discrepancy was quite large: A binomial test of the null hypothesis that
subjects who possess only number conservation and subjects who possess only
quantity conservation occur with equal frequency produced a probability
value of 7.58 x 10^[exponent illegible] with a judgments-only criterion and a
probability value of 3.79 x 10^-4 with a judgments-plus-explanations criterion.
The difference between these two values is statistically significant.

To return to the two questions posed at the beginning of this section,
the answer to both parts of the first question is an unqualified "no." In
our review of the response criteria employed in the relevant studies, we
observed no discernible tendency to employ identity and equivalence
criteria of differential stringency. Although overall criterion stringency
differed from one study to another (e.g., compare Koshinsky & Hall, 1973,
with Schwartz & Scholnick, 1970), within any single study the typical
pattern was to choose some criterion a priori and then apply it consistently
to both identity and equivalence responses. On the other hand, the answer
to both parts of the second question posed above is a qualified "yes."
Concerning the first half of the question, the supportive evidence reported
by Elkind and Schoenfeld (1972), Papalia and Hooper (1971), and Schwartz and
Scholnick (1970) involved a judgments-only criterion for both identity
and equivalence. Concerning the second half of the question, the
nonsupportive evidence reported by Moynahan and Glick (1972) and Koshinsky and
Hall (1973) was based on a judgments-plus-explanations criterion for both
identity and equivalence. The previously mentioned consequences of Type II
errors suggest that the latter procedure inevitably would reduce the visibility
of a real identity -> equivalence sequence.

Hooper's (1969a, 1969b) original studies are the only exceptions
to the rule that supportive studies tended to employ a judgments-only
criterion. However, both of these studies provide much less substantial
evidence for an identity -> equivalence sequence than the other supportive
studies. In the first Hooper study, a total of 108 children between 5 years,
6 months and 8 years, 6 months were assigned to three different testing
conditions. Each subject in the first condition was administered either
of two tests of identity. Each subject in the second condition was
administered either of two tests of equivalence. Each subject in the third
condition was administered either of two tests of equivalence, both of
which were different from the tests administered in the second condition.
After testing, a judgments-plus-explanations criterion was used to partition
the subjects in the first condition into conservers and nonconservers of
identity. The same criterion was used to partition subjects in the second
and third conditions into conservers and nonconservers of equivalence.
This procedure permitted a total of 19 chi-square tests of the hypothesis
that identity and equivalence tests were of equivalent difficulty. Of
these 19 possible comparisons, only one [first condition/both tests (N = 36)
vs. both equivalence conditions/both tests (N = 72)] turned out to be
significant (χ² = 5.35, p < .05). In other words, the null hypothesis could
not be rejected in 18 of 19 cases. With such a large number of
interdependent significance tests, it is obvious that the effective alpha level
was much higher than the nominal .05. In fact, the probability that at least
one of the 19 tests would be significant at the .05 level by chance alone
was substantial.
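
A rough gauge of how far 19 tests inflate the nominal alpha level can be computed under simplifying assumptions. The sketch below is illustrative only: it treats the tests as independent, which the tests discussed above are not, and it also reports the Bonferroni (Boole) upper bound, which holds whatever the dependence among tests. The function names are hypothetical.

```python
def familywise_error_rate(alpha, k):
    """P(at least one of k tests is significant by chance),
    assuming the k tests are independent."""
    return 1.0 - (1.0 - alpha) ** k


def bonferroni_upper_bound(alpha, k):
    """Upper bound on the same probability under any dependence structure."""
    return min(1.0, k * alpha)


rate = familywise_error_rate(alpha=0.05, k=19)
bound = bonferroni_upper_bound(alpha=0.05, k=19)
print(f"independent tests: {rate:.3f}")
print(f"upper bound:       {bound:.2f}")
```

Under the independence assumption the chance of at least one spuriously significant result is a little over .6, against a nominal .05 for a single test; the bound shows that it cannot exceed .95 for 19 tests at that level.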

The second Hooper study also provides somewhat meager support for
an identity -> equivalence sequence. A total of 80 subjects were administered
tests of identity and equivalence. A judgments-plus-explanations criterion
was used to partition the subjects into conservers of both concepts,
conservers of only identity, conservers of only equivalence, and conservers
of neither concept. Only 11 subjects were classified as members of the
second group. Because none of the subjects were classified as members
of the third group, however, the null hypothesis that identity and
equivalence tests were of equivalent difficulty was rejected (p < .05) by the
binomial test.
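
The binomial test on the two discordant groups can be sketched as follows. This is an assumed reconstruction rather than Hooper's actual computation: the source does not say whether the test was one- or two-tailed, so the sketch uses a two-tailed exact test, and the function name is hypothetical.

```python
from math import comb


def binomial_two_sided_p(a_only, b_only):
    """Exact two-sided binomial (sign) test of the null hypothesis that
    'A only' and 'B only' subjects are equally likely.
    Subjects showing both concepts or neither are ignored."""
    n = a_only + b_only
    k = max(a_only, b_only)
    # Probability of k or more subjects in the larger cell when p = .5
    upper_tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * upper_tail)


# Counts reported for the second Hooper study: 11 identity-only, 0 equivalence-only.
p = binomial_two_sided_p(a_only=11, b_only=0)
print(f"two-sided p = {p:.6f}")
```

With 11 subjects in one discordant cell and none in the other, the result falls well below .05, so the significance of the test rests entirely on a small number of subjects out of 80.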

The nonsupportive studies of Northman and Gruen (1970) and Murray
(1970) are exceptions to the rule that nonsupportive studies tended to
employ a judgments-plus-explanations criterion. In both cases, the subjects
were classified as conservers or nonconservers of identity and
equivalence via a judgments-only criterion. However, the age of the subjects
employed in these studies was sufficiently above the level at which one
may reasonably expect to find evidence of an identity -> equivalence
sequence (cf. the following section) that the choice of criterion may well
have been irrelevant.

Age of Subjects

In a recent study, Elkind and Schoenfeld (1972) made the following
observations: identity conservation can be arrived at on the basis of
preoperational mental structures whereas equivalence conservation requires
both the preoperational structures and the concrete operations that develop
at about the ages of 6 or 7. To solve the identity conservation problems,
the young child can simply call upon accumulated past experience... Young
children cannot, however, arrive at equivalence conservation because they
lack the deductive apparatus of the concrete operational system. Accordingly,
one should find that young children have identity conservation but not
equivalence conservation whereas older children should have both ... the
differential performance on identity and equivalence tests should hold true
for preoperational children but not for concrete operational children [p. 530].

The practical implication of this argument for developmental studies
of identity and equivalence is straightforward. If identity and equivalence
tests are administered to preschoolers and, perhaps, also to kindergarteners,
then the identity -> equivalence sequence probably will be observed. On the
other hand, if identity and equivalence tests are administered to elementary
schoolers (an age level at which most subjects would be considered
concrete-operational), then the sequence probably will not be observed.

Note that it is not necessary to accept either the earlier analysis
on which Elkind and Schoenfeld's observations are based (Elkind, 1967) or
their invocation of Piaget's cognitive-developmental stages to see that the
measurement implication of their observations probably is correct.
[Concerning Piaget's stages, for example, it is not at all clear, at least not
to the present writers, that vaguely defined typological constructs such as
"preoperational child" and "concrete-operational child" have any explanatory
power.] Consider two concepts A and B such that A -> B during cognitive
development. Suppose that A normally emerges during some age range R_1 and
that B normally emerges during some later age range R_2. Suppose that tests
of A and B are administered to a sample from R_1. Assuming errorless measures,
two and only two types of subjects will be observed: first, most subjects will
pass A but fail B; second, the remaining subjects will fail both tests. Because
the null hypothesis tested in such a study is concerned with observed frequencies
of subjects who pass A but fail B and who pass B but fail A, the data of this
sample should provide strong support for the A -> B sequence. On the other
hand, suppose the tests are administered to a sample from R_2. Again assuming
errorless measures, two and only two types of subjects will be observed: first,
most subjects will pass both tests; second, the remaining subjects will pass A
but fail B. The chances of rejecting the null hypothesis obviously will be
much smaller with this second sample than with the first.
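
The contrast between the two samples can be illustrated numerically. In the sketch below the counts are hypothetical (20 subjects per sample, with 12 "A only" subjects in the R_1 sample and 3 in the R_2 sample), measures are assumed errorless so that no "B only" subjects occur, and a one-sided sign test is assumed; none of these values comes from the studies reviewed.

```python
def one_sided_sign_test_p(n_a_only):
    """One-sided sign test when every discordant subject is 'A only'.

    Under the null hypothesis each discordant subject is equally likely to
    fall in either discordant cell, so the chance that all n fall in the
    'A only' cell is 0.5 ** n."""
    return 0.5 ** n_a_only


# Hypothetical samples of 20 subjects each.
p_young = one_sided_sign_test_p(12)  # R_1 sample: 12 pass A only, 8 fail both
p_old = one_sided_sign_test_p(3)     # R_2 sample: 17 pass both, 3 pass A only
print(f"R_1 sample: p = {p_young:.6f}")
print(f"R_2 sample: p = {p_old:.3f}")
```

Although the same A -> B sequence underlies both hypothetical samples, only the younger sample yields enough discordant subjects to reject the null hypothesis at the .05 level.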

By Elkind and Schoenfeld's argument, studies in which partially or
primarily preschool samples are employed should tend to find an identity ->
equivalence sequence more frequently than studies in which only older samples
are employed. A review of the age ranges of the subject samples employed
in the supportive and nonsupportive studies provides considerable support
for this argument. At least some preschoolers were tested in three of the
five supportive studies: one-fourth of Schwartz and Scholnick's (1970)
subjects were preschoolers; one-third of Papalia and Hooper's (1971)
subjects were preschoolers; one-half of Elkind and Schoenfeld's (1972) subjects
were preschoolers. Hooper's studies (1969a, 1969b) are the only supportive
studies in which no preschoolers were tested. In both studies, the youngest
subjects were kindergarteners. However, both studies are subject to the
caveats mentioned earlier.

In contrast with the subject samples of the supportive studies,
preschoolers were not tested in any of the nonsupportive studies. Murray's
(1970) youngest subjects were 6-year-olds; Moynahan and Glick's (1972)
youngest subjects were 6-year-olds; Koshinsky and Hall's (1973) youngest
subjects were 5-year-olds. Thus, with the exception of the Koshinsky and
Hall study and the Hooper studies, the lower age bounds of the supportive
and nonsupportive studies do not overlap.

New Evidence 

In this section, some new data are reported from an experiment designed
to examine the major conclusions of the preceding analysis. The design was a
3 X 2 X 2 X 2 mixed-model analysis of variance. The first factor was age
(preschool/kindergarten/third grade). The last three factors were task
(identity/equivalence), criterion (judgments-only/judgments-plus-explanations),
and content (length/weight). The four main effects were not of interest from
the standpoint of the preceding analysis. Instead, attention was focused on
the first-order interactions of age with task and task with criterion. From the
preceding section on age of subjects, it follows that there should be an
Age X Task interaction such that performance differences between identity
and equivalence are more pronounced for younger subjects. From the preceding
section on response criteria, it follows that there should be a Task X
Criterion interaction such that performance differences between identity
and equivalence are more pronounced with judgments-only than with
judgments-plus-explanations.

Finally, the first-order interaction of age with criterion was of
some interest. An Age X Criterion interaction is anticipated on the ground
that the effect of any source of Type II error in concept assessment may be
expected to decrease with age (Flavell & Wohlwill, 1969).

Subjects

The present sample consisted of three age levels: preschool
(4-year-olds), kindergarten (6-year-olds), and third grade (8-year-olds). A
total of 60 preschoolers (29 girls and 31 boys), 60 kindergarteners (30 boys and
30 girls), and 60 third graders (30 girls and 30 boys) were tested. All
subjects were pretested for their understanding of the relational terms
employed in the concept assessments described below.

Procedure

All subjects were administered identity and equivalence tests in two
content areas. The materials used in the four tests were the same as those
described by Hooper (1969b). The procedure and questions were the same as
those reported by Brainerd (1972, 1973a) and the identity tests were adapted
from Hooper (1969a). Each of the four tests involved two basic steps:
prediction and transformation. During the prediction phase, a stimulus
transformation of some sort was proposed and three questions were posed
about the quantitative status of the stimulus after transformation (same?
more? less?). During the transformation phase, a transformation was
performed on one of two quantitatively equivalent stimuli and three questions
were posed about the posttransformation relationship between the two stimuli
(same? more? less?). During both phases, subjects were asked to explain
(randomly) one of their three answers.

Identity tests. During the prediction phase of the length identity
test, a 28.0-cm string was placed in the center of the table at which
the subject and experimenter were seated. The experimenter posed the
following questions: (a) If I bent this string into a circle, would it
still be the same length as it is now? (b) If I bent this string into a
circle, would it be longer than it is now? (c) If I bent this string into
a circle, would it be shorter than it is now? Each subject was asked to
explain (randomly) one of his prediction phase judgments. During the
transformation phase, the experimenter bent the string into a circle and posed
three more questions: (a) Is the string still the same length as before?
(b) Is the string longer now than it was before? (c) Is the string shorter
now than it was before? Subjects were again asked to explain one of their
three judgments. The weight identity test was the same as the length
identity test, except that a clay ball was used as a stimulus and the
transformation involved flattening the ball into a "pancake."

Equivalence tests. During the prediction phase of the length equi- 
valence test, 28.0-cm strings were used. After the subject had agreed that 
the two strings were the same length, three questions were posed: (aj if 
I bent one of these strings into a circle, would the two strings still be 
the same length? (b) If I bent one of these strings into a circle, would 
one of the strings be longer than the other? (c) If I bent one of these 
strings into a circle, would one of the strings be shorter than the other? 
Subjects explained one of their three judgments. During the transformation 
phase, one of the strings was bent into a circle and three questions were 
posed: (a) Are the two strings still the same length? (b) Is one of the 
strings longer than the other now? (c) Is one of the strings shorter than 
the other now? Again, subjects explained one of their three judgments. 
The weight equivalence tests were the same as the length equivalence tests, 
except that two clay balls were employed and the transformation involved 
flattening one of them into a "pancake." 

Scoring 

In all, each subject made 24 judgments and explained 8 of these 
judgments. The data were scored by both a judgments-only criterion and a 
judgments-plus-explanations criterion. Concerning the former, subjects 
were given a point for each item on which they emitted a correct judgment. 
Concerning the latter, subjects were given a point on each of the 8 explanation 
items only if they emitted both a correct judgment and a correct explanation. 
On the 16 nonexplanatory items, subjects were given a point for each correct 
judgment. As is common in the literature (e.g., Brainerd & Brainerd, 
1972; Hooper, 1969a, 1969b; Koshinsky & Hall, 1973), an explanation was 
considered correct if it fell in one of the usual Genevan categories: addition/ 
subtraction; inversion reversibility; reciprocity reversibility (compensation); 
tautology. 
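
The two scoring rules can be illustrated with a minimal Python sketch. The item layout (2 tasks x 2 contents x 2 phases x 3 questions) follows the Method section; the `Item` class, the function names, the choice of the "same" question as the explained item, and the "perceptual appearance" label for an incorrect explanation are all hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass
from itertools import product
from typing import Optional

# Explanation categories counted as correct in the Scoring section.
GENEVAN_CATEGORIES = {
    "addition/subtraction",
    "inversion reversibility",
    "reciprocity reversibility",
    "tautology",
}

TASKS = ("identity", "equivalence")
CONTENTS = ("length", "weight")
PHASES = ("prediction", "transformation")
QUESTIONS = ("same", "more", "less")


@dataclass
class Item:
    task: str
    content: str
    phase: str
    question: str
    judgment_correct: bool
    explained: bool = False                      # was an explanation requested?
    explanation_category: Optional[str] = None   # None if no explanation was requested


def score_judgments_only(items):
    """One point per correct judgment, explanations ignored."""
    return sum(item.judgment_correct for item in items)


def score_judgments_plus_explanations(items):
    """Explained items need a correct judgment AND a correct explanation."""
    total = 0
    for item in items:
        passed = item.judgment_correct
        if item.explained:
            passed = passed and item.explanation_category in GENEVAN_CATEGORIES
        total += passed
    return total


def build_protocol():
    """Hypothetical subject: every judgment correct, but explanations are
    adequate only on the identity tests. In the study the explained item was
    chosen at random; here the "same" question is used for simplicity."""
    items = []
    for task, content, phase in product(TASKS, CONTENTS, PHASES):
        for question in QUESTIONS:
            explained = question == "same"
            category = None
            if explained:
                category = "tautology" if task == "identity" else "perceptual appearance"
            items.append(Item(task, content, phase, question, True, explained, category))
    return items


items = build_protocol()
judgments_only = score_judgments_only(items)
judgments_plus = score_judgments_plus_explanations(items)
print(len(items), sum(i.explained for i in items), judgments_only, judgments_plus)
```

For this hypothetical subject the two criteria diverge only on the four explained equivalence items, which is the kind of criterion-dependent difference the analysis below is designed to detect.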

Results 



Insert Table 1 about here 



Insert Table 2 about here 

A summary of the analysis of variance appears in Table 1. The means 
and standard deviations for the analysis appear by factor and level in Table 
2. Concerning the mean entries in Table 2, the highest possible value for each 
cell is 6.0. 

It can be seen in Table 1 that the main effects of age, task, and 
criterion were large and highly significant. Newman-Keuls tests of the age 
effect indicated that the tasks were more difficult for preschoolers than 
they were for kindergarteners (p < .001) or third graders (p < .001) and that 
the tasks were more difficult for kindergarteners than they were for third 
graders (p < .001). Concerning the task effect, the equivalence tasks were 
more difficult than the identity tasks. Concerning the criterion effect, 
more trials were passed with a judgments-only criterion than with a judgments- 
plus-explanations criterion. 
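
As a rough check on the main effects, each F ratio is the effect mean square divided by its error mean square. The sketch below uses mean squares transcribed from Table 1. The pairing of each effect with an error term (Age with the between-subjects error, Task with the A x B within-subjects error, Criterion with the A x D within-subjects error) is an assumption based on the usual layout of a mixed-design analysis, and `f_ratio` is a hypothetical helper name.

```python
# (effect MS, error MS, F reported in Table 1); error pairings are assumed.
TABLE1_MAIN_EFFECTS = {
    "Age (A)": (9572.15, 124.84, 76.68),
    "Task (B)": (498.34, 5.40, 92.30),
    "Criterion (D)": (106.57, 0.90, 118.02),
}


def f_ratio(ms_effect, ms_error):
    """F = MS(effect) / MS(error)."""
    if ms_error <= 0:
        raise ValueError("error mean square must be positive")
    return ms_effect / ms_error


computed = {
    name: f_ratio(ms_effect, ms_error)
    for name, (ms_effect, ms_error, _) in TABLE1_MAIN_EFFECTS.items()
}

for name, (_, _, reported) in TABLE1_MAIN_EFFECTS.items():
    print(f"{name}: computed F = {computed[name]:.2f}, reported F = {reported:.2f}")
```

The computed ratios land close to the tabled values. Small discrepancies are to be expected because the tabled mean squares are rounded to two decimals.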

It also can be seen in Table 1 that the two effects of principal 
interest from the standpoint of our earlier methodological analysis, the 
Age X Task interaction and the Task X Criterion interaction, both were 
observed. Concerning the Age X Task interaction, Newman-Keuls tests indicated 
that equivalence tasks were more difficult than identity tasks for 
preschoolers (p < .001) and kindergarteners (p < .001) but not for third graders. 
Concerning the Task X Criterion interaction, Newman-Keuls tests indicated 
that the performance difference between identity and equivalence tasks was 
significantly greater with a judgments-only criterion than with a 
judgments-plus-explanations criterion (p < .001). Finally, the anticipated 
Age X Criterion interaction was observed. Newman-Keuls tests indicated that 
the discrepancy between the two criteria was more pronounced with 
preschoolers and with kindergarteners than with third graders (p < .01 and p < 
.025, respectively). The preschoolers and kindergarteners did not differ. 
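
The direction of the three interactions can be read off the cell means in Table 2. The following sketch is descriptive only: it computes differences between means and performs no significance tests. The cell means are transcribed from Table 2; the variable and function names are hypothetical, and averaging the cell means without weights assumes equal numbers of subjects at each age level.

```python
from statistics import mean

# Cell means from Table 2, ordered as
# (length identity, length equivalence, weight identity, weight equivalence).
MEANS = {
    "judgments-only": {
        "Preschool": (2.40, 1.18, 2.37, 1.13),
        "Kindergarten": (3.52, 2.28, 3.63, 2.28),
        "Third grade": (5.03, 4.58, 4.92, 4.55),
    },
    "judgments-plus-explanations": {
        "Preschool": (1.73, 0.40, 1.60, 0.82),
        "Kindergarten": (3.10, 2.05, 3.03, 1.93),
        "Third grade": (4.68, 4.52, 4.55, 4.35),
    },
}
AGES = ("Preschool", "Kindergarten", "Third grade")


def task_gap(cell):
    """Identity mean minus equivalence mean, averaged over length and weight."""
    len_id, len_eq, wt_id, wt_eq = cell
    return mean((len_id, wt_id)) - mean((len_eq, wt_eq))


# Age X Task: identity-equivalence gap at each age, averaged over criteria.
task_gap_by_age = {
    age: mean(task_gap(MEANS[crit][age]) for crit in MEANS) for age in AGES
}

# Task X Criterion: identity-equivalence gap under each criterion, averaged over ages.
task_gap_by_criterion = {
    crit: mean(task_gap(MEANS[crit][age]) for age in AGES) for crit in MEANS
}

# Age X Criterion: judgments-only mean minus judgments-plus-explanations mean at each age.
criterion_gap_by_age = {
    age: mean(MEANS["judgments-only"][age])
    - mean(MEANS["judgments-plus-explanations"][age])
    for age in AGES
}

for label, table in (
    ("Task gap by age", task_gap_by_age),
    ("Task gap by criterion", task_gap_by_criterion),
    ("Criterion gap by age", criterion_gap_by_age),
):
    print(label, {key: round(value, 3) for key, value in table.items()})
```

In these descriptive terms, the identity-equivalence gap is smallest for third graders and larger under the judgments-only criterion, and the gap between criteria shrinks with age, which agrees in direction with the Newman-Keuls results reported above.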

Conclusions 

Both our earlier methodological arguments and the findings just 
reported tend to substantiate the preliminary hypothesis with which we began. 
The Age X Task interaction indicates that the identity -> equivalence 
sequence is less visible in older subjects and, hence, the ages of the 
subject samples employed in the nonsupportive studies militated against 
finding the sequence. Similarly, the Task X Criterion interaction indicates 
that the identity -> equivalence sequence is less visible with a judgments- 
plus-explanations criterion and, hence, the response criteria employed in 
some of the nonsupportive studies militated against finding the sequence. 
Therefore, it seems reasonable to conclude that identity conservation does 
indeed precede equivalence conservation. It also seems reasonable to 
conclude that existing failures to find the sequence are best viewed as 
consequences of having committed measurement and sampling errors. 

In closing, we should like to dwell briefly on the general 
significance of the measurement and sampling errors discussed in the review 
portion of this paper for concept development research. These error 
sources obviously are relevant to all studies concerned with the order of 
emergence of concepts, not just to studies of identity and equivalence. 
If these error sources are not carefully controlled in such studies, then 
discrepant findings of the sort we have just considered will be the 
inevitable result. This will no doubt seem a trivially obvious point to 
many readers. However, the point is worth making because cognitive-develop- 
mental investigators do not generally acknowledge it. In the concept develop- 
ment literature, the effects of measurement and sampling errors, even ones as 
gross as those reviewed earlier, are only rarely discussed (e.g., cf. 
Flavell, 1971; Brainerd, 1973a, 1974d). To illustrate, the elementary 
question of the general consequences of Type I and Type II measurement errors 
on the null hypotheses tested in concept development studies has never 
been systematically examined (Brainerd, 1974d). Many investigators who 
study the order of emergence of concepts in children's thinking appear to 
be laboring under the misapprehension that the sequences (or concurrences) 
which they study are so robust that rigorous control of measurement and 
sampling error is unnecessary. It is to be hoped that the present 
demonstration of the pronounced effects of such errors on the identity -> 
equivalence sequence will help dispel this misapprehension. 

Footnote 

*We have confined this review to published experiments in which normal 
children were studied. Pilot studies, unpublished papers, unpublished theses, 
and studies employing special populations (e.g., retardates) have not been 
included. Unpublished experiments were excluded because they pose the problem 
of reader accessibility. Studies of special populations were excluded 
because they pose some obvious interpretational problems (e.g., if the 
identity -> equivalence sequence fails to show up in retardates, what, if 
anything, does this tell us about normal cognitive development?). A complete 
bibliography of pilot studies, major unpublished experiments, and studies 
employing special populations is available from the authors. 

TABLE 1 

Summary of Age X Task X Content X Criterion Analysis of Variance. 

Source               df         MS          F

Age (A)               2    9572.15      76.68***
Error between       177     124.84
Task (B)              1     498.34      92.30***
Content (C)           1       3.34      <1
Criterion (D)         1     106.57     118.02***
A x B                 2      53.40       9.89***
A x C                 2       0.56      <1
A x D                 2       4.18       4.63*
B x C                 1       0.01      <1
B x D                 1       3.68      45.37***
C x D                 1       0.43       5.30**
A x B x C             2       0.06      <1
A x B x D             2       0.19       2.37
A x C x D             2       0.32      <1
B x C x D             1       0.00      <1
A x B x C x D         2       0.11      <1

Error within:
  A x B             177       5.40
  A x C             177      11.79
  A x D             177       0.90
  A x B x C         177       1.05
  A x B x D         177       0.08
  A x C x D         177       0.03
  A x B x C x D     177       0.15

*p < .05 
**p < .01 
***p < .0001 

TABLE 2 

Summary Statistics for Age X Task X Content X Criterion Analysis of Variance. 

                              Length                    Weight
Age                   Identity   Equivalence    Identity   Equivalence

Judgments-only
  Preschool
    M                   2.40        1.18          2.37        1.13
    SD                  1.25        1.28          0.97        1.13
  Kindergarten
    M                   3.52        2.28          3.63        2.28
    SD                  1.78        2.23          1.85        2.34
  Third grade
    M                   5.03        4.58          4.92        4.55
    SD                  1.59        2.42          1.83        2.47

Judgments-plus-explanations
  Preschool
    M                   1.73        0.40          1.60        0.82
    SD                  1.22        1.12          1.03        0.97
  Kindergarten
    M                   3.10        2.05          3.03        1.93
    SD                  1.77        2.10          1.91        2.20
  Third grade
    M                   4.68        4.52          4.55        4.35
    SD                  2.06        2.44          2.01        2.51



